To navigate through the report please click on the blue tabs under each heading.

Experimental Details

This compound was screened in the PRISM assay at 8 doses (3-fold serial dilution) with a 5-day treatment; 894 cancer cell lines passed QC. Two PRISM cell line collections were used in the assay: PR500 (adherent cell lines only) and PR300+ (adherent and suspension cell lines). Three benchmark compounds, included in the MTS Validation set, were also tested at dose to ensure high data quality. All compounds were run in triplicate, and each plate contained positive (Bortezomib, 20 µM) and negative (DMSO) controls.
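
As a rough illustration of the dosing scheme, the sketch below generates an 8-point, 3-fold dilution series. The top dose of 10 µM and the function name are assumptions made for the example; the actual top dose for this compound is not stated here.

```python
def dilution_series(top_dose, n_points=8, fold=3.0):
    """Return doses from highest to lowest, each `fold` times lower than the last."""
    return [top_dose / fold**i for i in range(n_points)]


# Hypothetical top dose of 10 µM (assumption, not taken from the report).
doses_um = dilution_series(10.0)
for dose in doses_um:
    print(f"{dose:.4f} µM")
```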

Data

Overview

This section contains QC information for the cell lines included in the assay as well as log-fold change and a link to the dose response data page. For more details on how each metric displayed here is calculated please refer to the Data Processing section above.

Heatmap

The viability values for each of the screened doses are shown in the heatmap below. Please hover over the heatmap to see the fold-change viability values relative to the negative control (DMSO). Blue indicates more killing (sensitivity), while red indicates less.

Viability table

The \(\log_2\) fold-change and viability values for each cell line at each dose are presented in the table below. Please note the same table is also available in Compound Data > Replicate collapsed viability data.
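
The relationship between the two quantities in the table can be sketched as follows. This is a minimal example that assumes the baseline is the median of the DMSO wells; the exact normalization used by the pipeline is described in the Data Processing section and may differ. Function names and the numbers are illustrative only.

```python
import math
from statistics import median


def log2_fold_change(treated_signal, dmso_signals):
    """log2 ratio of a treated well to the DMSO baseline (median is an assumption)."""
    baseline = median(dmso_signals)
    return math.log2(treated_signal / baseline)


def viability_from_lfc(lfc):
    """Convert a log2 fold-change back to a fold-change viability (1.0 = no effect)."""
    return 2.0 ** lfc


# Made-up signals: a treated well at half the DMSO signal.
lfc = log2_fold_change(50.0, [100.0, 98.0, 102.0])
print(lfc, viability_from_lfc(lfc))
```

Under this convention, a log2 fold-change of -1 corresponds to a viability of 0.5 relative to DMSO.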

Dose response parameters

For dose response parameters and curves see the dose-response page for this compound.

Lineage enrichment

Overview

Each cell line is represented by its AUC and plotted by lineage (primary disease and sub-type) to visualize lineage-based sensitivity patterns. In particular, we calculated the effect size (the difference between that lineage and all others) and p-values based on a t-test. Because t-tests are performed at each dose and for logAUC and logIC50 (to allow analysis of relationships at particular doses and overall), the same feature may appear multiple times in the table. The dose or curve-level statistic used for each test is noted in the “Dose” column. For each lineage, q-values (significance values corrected for the false discovery rate) are computed from the p-values using the Benjamini-Hochberg algorithm.
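
For readers unfamiliar with the Benjamini-Hochberg adjustment, a minimal sketch of the standard procedure is given below. It is not the report's own code; the function name and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p-values for four hypothetical tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each q-value is the smallest false discovery rate at which the corresponding test would still be called significant.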

Volcano plot

Lineage effects are visualized below on an interactive volcano plot. Hover over a point to see its gene/feature ID. A large negative effect size corresponds to increased sensitivity.

Plots by lineage

Mutation effect

Overview

Cell lines are grouped by mutation status to compute mutation-based sensitivity patterns. In the tables and plots, “hs” denotes a hotspot mutation, “dam” denotes a damaging mutation, and “other” denotes other missense mutations (for more information see DepMap mutation information). In particular, we calculated the effect size (the difference between a mutation status and all others) and p-values based on a t-test. Because t-tests are performed at each dose and for logAUC and logIC50 (to allow analysis of relationships at particular doses and overall), the same feature may appear multiple times in the table. The dose or curve-level statistic used for each test is noted in the “Dose” column. For each mutation, q-values (significance values corrected for the false discovery rate) are computed from the p-values using the Benjamini-Hochberg algorithm.
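
The group comparison described above can be sketched as follows. This example assumes the effect size is the difference in group means and uses Welch's (unequal-variance) t statistic; the report does not state which t-test variant is used, so both choices are assumptions. The p-value step is omitted because it requires the t-distribution.

```python
import math
from statistics import mean, variance


def effect_size_and_t(values, in_group):
    """Compare cell lines with a given mutation status against all other lines.

    values   : one response per cell line (e.g. logAUC)
    in_group : parallel booleans, True if the line carries the mutation status
    Returns (difference in means, Welch t statistic).
    """
    group = [v for v, flag in zip(values, in_group) if flag]
    rest = [v for v, flag in zip(values, in_group) if not flag]
    effect = mean(group) - mean(rest)
    # Standard error under unequal variances (Welch); an assumption for this sketch.
    std_err = math.sqrt(variance(group) / len(group) + variance(rest) / len(rest))
    return effect, effect / std_err


# Made-up AUC values: the first two lines carry a hypothetical hotspot mutation.
effect, t_stat = effect_size_and_t([0.2, 0.4, 0.8, 1.0], [True, True, False, False])
print(effect, t_stat)
```

A negative effect here means the mutant lines have lower AUC, that is, greater sensitivity.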

Volcano plot

Mutation effects are visualized below on interactive volcano plots. Hover over a point to see its gene/feature ID. A large negative effect size corresponds to increased sensitivity. Distributions of AUC values within each mutation are visualized on the next tab, sorted by significance.

Plots by mutation

Correlation analysis

Info

In this section, we explore the univariate associations between the PRISM sensitivity profiles and the genomic features or genetic dependencies. In particular, we compute the Pearson correlations and associated p-values.

On each of the following tabs, the correlations and p-values for log-viability values at each dose, AUC scores and logIC50 values are shown. Because correlations are done at each dose and for AUC and logIC50 (in order to allow for analysis of relationships at particular doses and overall), the same feature may appear multiple times in the table. The dose or curve-level statistic used for correlation is noted in the “Dose” column. For each dataset, the q-values are computed from p-values using the Benjamini-Hochberg algorithm. Associations with q-values above 0.1 are filtered out. Also, the significant genes/features are illustrated on interactive volcano plots. Please hover the mouse over the points in order to see the gene/feature IDs. Data interpretation overviews are included above each plot.
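
The two steps described above, computing a Pearson correlation and then dropping associations with q-values above 0.1, can be sketched as follows. The feature names, q-values, and function names are invented for the example and do not come from this report.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def keep_significant(rows, q_cutoff=0.1):
    """Drop associations whose q-value is above the cutoff."""
    return [row for row in rows if row["q"] <= q_cutoff]


# Hypothetical sensitivity profile and feature values across five cell lines.
log_auc = [-1.2, -0.8, -0.5, -0.3, -0.1]
expression = [8.1, 7.4, 6.9, 6.2, 5.8]
print(pearson(log_auc, expression))

# Hypothetical association table with precomputed q-values.
table = [
    {"feature": "GENE_A", "dose": "logAUC", "q": 0.03},
    {"feature": "GENE_B", "dose": "logAUC", "q": 0.25},
]
print(keep_significant(table))
```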

Gene expression

CRISPR knock-out

Micro RNA

Proteomics

Copy number

Metabolomics

shRNA knockdown

Repurposing compounds

Multivariate biomarker analysis

Info

Next, for the log-viability at each dose, the AUC, and the logIC50, we fit multivariate models using the molecular characterizations and genetic dependencies of the PRISM cell lines. The resulting feature importances can be used to suggest potential biomarkers of compound response or to inform hypotheses about mechanisms of action. As with the continuous analyses, features may appear in the table or on the plot multiple times for different “doses” (note that “dose” can also refer to AUC or logIC50).

In particular, we trained random forest models using:
i. CCLE features (copy number alterations, RNA expression, mutation status, and lineage annotation)
ii. CCLE features + reverse phase protein array (RPPA) data + CRISPR + miRNA + metabolomics (MET)

For each model, we report the cross-validated R-squared value and Pearson score (the correlation between the model predictions and the PRISM profile) as measures of model performance. These are summarized in the model table below and describe how accurate each model is.
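
As a reminder of what the R-squared value measures, the sketch below computes the coefficient of determination from held-out predictions. It assumes the usual definition, one minus the ratio of residual to total sum of squares; the report does not spell out its exact formula, and the numbers are made up.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out responses and the model's predictions for them.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted))
```

In a cross-validated setting, the predictions for each cell line would come from a model that was not trained on that cell line.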

For each model, the feature importances are normalized so that they sum to 1, and they are tabulated along with the accuracy measures.
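
The normalization amounts to dividing each importance by the model's total, as in this minimal sketch. The feature names and raw values are hypothetical.

```python
def normalize_importances(raw_importances):
    """Scale a model's feature importances so that they sum to 1."""
    total = sum(raw_importances.values())
    return {name: value / total for name, value in raw_importances.items()}


# Hypothetical raw importances from one model.
raw = {"GENE_A_expression": 2.0, "GENE_B_copy_number": 1.0, "lineage_skin": 1.0}
print(normalize_importances(raw))
```

After normalization, importances are comparable within a model as shares of the total, but not as absolute quantities across models.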

In each tab of the results section, we plot the R-squared value of each model against the importance of its most important feature. Models with R-squared values above 0.2 are considered valid, and those above 0.3 are considered strong.
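
The thresholds above can be expressed as a small helper. The label "below threshold" for models at or under 0.2 is an assumption added for the example; the report only names the valid and strong categories.

```python
def model_strength(r2):
    """Label a model by its cross-validated R-squared value."""
    if r2 > 0.3:
        return "strong"
    if r2 > 0.2:
        return "valid"
    return "below threshold"  # hypothetical label, not used in the report


for r2 in (0.35, 0.25, 0.10):
    print(r2, model_strength(r2))
```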

Please note that the feature importances are not directional quantities. For the directionality of the genomic features, refer to the univariate correlation analysis.

Results

CCLE model (LIN + GE + MUT + CNA)

Variable importance versus model accuracy (\(R^2\)). High variable importance together with high model accuracy suggests that a feature is important for predicting sensitivity (at the given dose modeled). Variable importance is directionless for random forest models, but the direction of an effect can be inferred from the univariate analysis results.

Complete model (LIN + GE + MUT + CNA + CRISPR + RPPA + miRNA + MET)

Variable importance versus model accuracy (\(R^2\)). High variable importance together with high model accuracy suggests that a feature is important for predicting sensitivity (at the given dose modeled). Variable importance is directionless for random forest models, but the direction of an effect can be inferred from the univariate analysis results.